fix: Enhance model management and transcription provider setup #292

Open

0xrushi wants to merge 2 commits into SimpleOpenSoftware:dev from 0xrushi:fix/cuda-vibevoice

Conversation

0xrushi (Contributor) commented Feb 7, 2026

  • Added `add_or_update_model` method in `ConfigManager` to facilitate adding or updating models in the configuration (a sketch of the resulting entry is shown after this list).
  • Updated `ChronicleSetup` to support a new OpenAI-Compatible transcription provider, allowing users to configure custom endpoints and API keys.
  • Enhanced user prompts for API base URL and model name during setup, improving the configuration experience.
  • Introduced unit tests for the new model management functionality and transcription provider setup, ensuring robust validation of the changes.
  • Improved Docker configurations for ASR services, including support for customizable CUDA versions and DNS settings.
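
For orientation, a rough sketch of the kind of model entry the setup flow persists to config.yml via `add_or_update_model`; the field names mirror the embedding example quoted in the review below, but the "custom-llm" name, model, and base URL here are illustrative assumptions, not the PR's literal values:

# Hypothetical example of a model entry the setup flow could persist
llm_model = {
    "name": "custom-llm",                           # assumed entry name
    "model_type": "llm",
    "model_provider": "openai",                     # OpenAI-compatible API family
    "api_family": "openai",
    "model_name": "llama-3.1-70b-versatile",        # whatever the user enters
    "model_url": "https://api.groq.com/openai/v1",  # user-supplied base URL
    "api_key": "${oc.env:CUSTOM_LLM_API_KEY,''}",   # resolved from .env at load time
    "model_output": "json",
}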

Summary by CodeRabbit

  • New Features

    • Added support for OpenAI-compatible LLM providers with custom endpoint configuration (e.g., Groq, Together AI, LM Studio).
    • Enhanced CUDA version selection for GPU-accelerated audio recognition services.
  • Tests

    • Expanded test coverage for custom LLM provider configuration workflow.
    • Added validation tests for transcription service URL configuration.
    • Added CUDA version compatibility verification tests.

coderabbitai bot (Contributor) commented Feb 7, 2026

📝 Walkthrough

The changes introduce support for OpenAI-compatible custom LLM endpoints in the setup workflow, add a model management method to ConfigManager, implement CUDA version configuration for ASR Docker services, establish test infrastructure for configuration validation, and add comprehensive test coverage for LLM provider setup and URL resolution.

Changes

Custom LLM Provider Setup (backends/advanced/init.py, config_manager.py)
    Adds interactive setup flow for OpenAI-compatible endpoints with API base URL and model name validation. Introduces add_or_update_model() method to ConfigManager for persisting custom model definitions to config.yml.

ASR CUDA Configuration (extras/asr-services/docker-compose.yml, extras/asr-services/providers/vibevoice/Dockerfile)
    Adds PYTORCH_CUDA_VERSION build argument with cu126 default to the VibeVoice service and Dockerfile. Configures DNS for the vibevoice-asr service. Enables CUDA-specific dependency installation during build.

Test Infrastructure (tests/libs/ConfigTestHelper.py, tests/test-requirements.txt)
    Introduces ConfigTestHelper class with utilities for configuration testing, including ConfigManager instantiation, model management wrappers, and OmegaConf resolution. Adds omegaconf and pyyaml dependencies.

Configuration and CUDA Tests (extras/asr-services/tests/test_cuda_version_config.py)
    Adds a comprehensive test suite validating CUDA version declarations, environment variable handling, GPU architecture mappings (RTX 5090 requires cu128), and pyproject.toml CUDA extras consistency across Dockerfiles and docker-compose.

LLM Provider and URL Configuration Tests (tests/configuration/test_llm_custom_provider.robot, tests/configuration/test_transcription_url.robot)
    Adds Robot Framework tests for custom LLM model creation, updates, defaults synchronization, and multi-provider transcription URL resolution with environment variable overrides and fallback behavior.
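
To make the "update" half of add_or_update_model() concrete, a minimal usage sketch, assuming ConfigManager is importable from config_manager with a no-argument constructor (both assumptions; the actual method body is quoted further down):

from config_manager import ConfigManager  # import path assumed from the cohort list above

cm = ConfigManager()
entry = {"name": "custom-llm", "model_type": "llm", "model_name": "my-model"}
cm.add_or_update_model(entry)        # no entry named "custom-llm" yet, so it is appended
entry["model_name"] = "my-model-v2"
cm.add_or_update_model(entry)        # same "name", so the existing entry is replaced in place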

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Init as setup_llm<br/>(init.py)
    participant Input as User Input<br/>Prompts
    participant ConfigMgr as ConfigManager
    participant Config as config.yml

    User->>Init: Run setup
    Init->>Input: Display provider options<br/>(1: OpenAI, 2: Ollama, 3: Custom, 4: Skip)
    Input->>User: Select option 3<br/>(Custom endpoint)
    User->>Input: Enter API base URL
    Input->>Init: Return base URL
    Init->>Input: Prompt for API key (optional)
    User->>Input: Provide or skip key
    Input->>Init: Return API key
    Init->>Input: Prompt for LLM model name
    User->>Input: Enter model name
    Input->>Init: Return model name
    Init->>ConfigMgr: Create model definition<br/>(provider, config, name)
    ConfigMgr->>Config: Load config.yml
    Config-->>ConfigMgr: Return current models
    ConfigMgr->>ConfigMgr: Add/update model entry
    ConfigMgr->>Config: Save updated models
    Config-->>ConfigMgr: Confirm save
    Init->>Input: Prompt for embedding model<br/>(optional)
    User->>Input: Enter or skip
    Input->>Init: Return embedding choice
    opt Embedding provided
        Init->>ConfigMgr: Create embedding model def
        ConfigMgr->>Config: Add embedding model
        Config-->>ConfigMgr: Confirm
    end
    Init->>ConfigMgr: Update defaults<br/>(llm, embedding refs)
    ConfigMgr->>Config: Update config defaults
    Config-->>ConfigMgr: Confirm
    Init->>User: Display success message<br/>with config details

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 passed

  • Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly summarizes the main changes: enhancing model management (add_or_update_model method) and transcription provider setup (OpenAI-Compatible provider).
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


AnkushMalaker self-assigned this Feb 15, 2026

AnkushMalaker (Collaborator) commented:

@coderabbitai review

AnkushMalaker self-requested a review February 23, 2026 05:05
coderabbitai bot commented Feb 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (4)
backends/advanced/init.py (1)

382-397: Consider whether model_output: "json" is appropriate for all LLM providers.

The LLM model entry hardcodes "model_output": "json", which implies JSON-mode output. Some OpenAI-compatible providers may not support JSON mode. This is minor since users can edit config.yml afterward, but worth a note.
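
A minimal sketch of the configurable alternative the prompt below describes; CUSTOM_LLM_MODEL_OUTPUT is a hypothetical env var and "text" an assumed safe fallback, not values taken from the PR:

import os

def build_llm_model(model_name: str, base_url: str) -> dict:
    """Sketch: build the custom LLM entry without forcing JSON mode."""
    # Fall back to plain text output unless the user explicitly opts into JSON.
    model_output = os.environ.get("CUSTOM_LLM_MODEL_OUTPUT", "text")
    return {
        "name": "custom-llm",
        "model_type": "llm",
        "model_name": model_name,
        "model_url": base_url,
        "model_output": model_output,
    }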

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backends/advanced/init.py` around lines 382 - 397, The code hardcodes
"model_output": "json" in the llm_model dict used by
self.config_manager.add_or_update_model, which may be unsupported by some
OpenAI-compatible providers; change this to not force JSON by either omitting
the model_output key when unknown, defaulting to a safer "text" value, or making
it configurable (e.g., read from an env var like CUSTOM_LLM_MODEL_OUTPUT) before
building llm_model so add_or_update_model receives a compatible model_output or
none.
extras/asr-services/tests/test_cuda_version_config.py (1)

89-112: Regex for service section extraction may over-match or under-match in edge cases.

The regex rf"{service_name}:.*?(?=^\S|\Z)" with MULTILINE | DOTALL matches from service_name: to the next line starting with a non-whitespace character. In docker-compose YAML, comments (#) at the start of a line (like the section separators on lines 18, 54, 85, etc.) would prematurely terminate the match. For vibevoice-asr, the # ============ comment block at line 122 would correctly terminate it, but if any comment lines were added within a service block, they'd cause a premature cut.

This is fragile but currently works with the file's structure. Worth a note if the compose file format changes.
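
A sketch of the indentation-anchored approach suggested in the prompt below, under the simplifying assumption that the service key sits at column 0 of the fragment being searched:

import re

def extract_service_block(compose_text: str, service_name: str) -> str:
    """Capture the service header plus only its indented (or blank) lines."""
    # Stops at the next column-0 key or EOF; column-0 comment lines between
    # services terminate the match cleanly without any DOTALL tricks.
    pattern = rf"^{re.escape(service_name)}:\n(?:[ \t]+.*\n?|\n)*"
    match = re.search(pattern, compose_text, re.MULTILINE)
    return match.group(0) if match else ""

# Illustrative fragment, not the real compose file:
compose = "vibevoice-asr:\n  build:\n    args:\n      PYTORCH_CUDA_VERSION: cu126\nother:\n  image: x\n"
assert "PYTORCH_CUDA_VERSION" in extract_service_block(compose, "vibevoice-asr")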

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/tests/test_cuda_version_config.py` around lines 89 - 112,
The service-block regex is fragile to comment lines and should match only the
service header plus subsequent indented lines; in
test_docker_compose_cuda_arg_consistency replace the current service_match regex
(used to set service_match and service_text) with one that anchors the service
name at line start and captures following lines that begin with whitespace
(e.g., use a pattern like start-of-line service_name: followed by zero or more
lines that start with spaces/tabs) with re.MULTILINE (avoid DOTALL), so
service_text contains only the indented service block before the next top-level
key or EOF and the subsequent PYTORCH_CUDA_VERSION assertion remains valid.
extras/asr-services/docker-compose.yml (1)

117-119: Hardcoded Google DNS may not work in restricted network environments.

The dns directive applies at runtime (not just build time), so containers in corporate or air-gapped environments that block external DNS resolvers will fail. Consider making this configurable via an environment variable or documenting why it's needed specifically for vibevoice-asr but not for other services.

💡 Suggested approach
     dns:
-      - 8.8.8.8
-      - 8.8.4.4
+      - ${DNS_PRIMARY:-8.8.8.8}
+      - ${DNS_SECONDARY:-8.8.4.4}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/docker-compose.yml` around lines 117 - 119, The
docker-compose.yml currently hardcodes Google DNS under the dns directive for
the vibevoice-asr service; change this to be configurable via an environment
variable so containers in restricted networks can override it. Replace the
literal dns list for the vibevoice-asr service with a variable placeholder (e.g.
use DNS_SERVERS with a default like ${DNS_SERVERS:-8.8.8.8,8.8.4.4}) and update
README/compose usage to document the expected comma-separated format (or how to
pass a YAML list) and how to set DNS_SERVERS in the environment or .env; ensure
the docker-compose parsing handles splitting or provide examples for both
default and custom values so the dns directive is not permanently hardcoded.
tests/libs/ConfigTestHelper.py (1)

41-41: Move urlparse to the top-level imports.

Local imports of stdlib modules are non-idiomatic and add a subtle maintenance burden.

♻️ Proposed change

At the top of the file, alongside the other imports:

+from urllib.parse import urlparse

Then in the method body:

-        from urllib.parse import urlparse
-        parsed = urlparse(url)
+        parsed = urlparse(url)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/libs/ConfigTestHelper.py` at line 41, The local stdlib import "from
urllib.parse import urlparse" should be moved to the module's top-level imports
in ConfigTestHelper to follow idiomatic practice: add "from urllib.parse import
urlparse" to the file's import block alongside the other imports, remove the
inline/local import statement (the one currently at line containing "from
urllib.parse import urlparse"), and keep existing uses of urlparse in the
function/method unchanged (e.g., any usage inside methods of the
ConfigTestHelper class).

Comment on lines +404 to +416

if embedding_model_name:
    embed_model = {
        "name": "custom-embed",
        "description": "Custom OpenAI-compatible embeddings",
        "model_type": "embedding",
        "model_provider": "openai",
        "api_family": "openai",
        "model_name": embedding_model_name,
        "model_url": base_url,
        "api_key": "${oc.env:CUSTOM_LLM_API_KEY,''}",
        "embedding_dimensions": 1536,
        "model_output": "vector"
    }

⚠️ Potential issue | 🟠 Major

Hardcoded embedding_dimensions: 1536 may mismatch the user's chosen model.

The embedding dimensions are hardcoded to 1536 (matching OpenAI's text-embedding-3-small), but users could specify any embedding model. A dimension mismatch will cause vector store indexing failures at runtime. Consider prompting for the dimension or at minimum documenting the assumption.

🔧 Proposed fix — prompt for dimensions with a sensible default
                     if embedding_model_name:
+                        embed_dims_str = self.prompt_value(
+                            "Embedding dimensions (e.g., 1536 for text-embedding-3-small)", "1536"
+                        )
+                        try:
+                            embed_dims = int(embed_dims_str)
+                        except ValueError:
+                            self.console.print("[yellow][WARNING][/yellow] Invalid dimensions, using 1536")
+                            embed_dims = 1536
                         embed_model = {
                             "name": "custom-embed",
                             "description": "Custom OpenAI-compatible embeddings",
                             "model_type": "embedding",
                             "model_provider": "openai",
                             "api_family": "openai",
                             "model_name": embedding_model_name,
                             "model_url": base_url,
                             "api_key": "${oc.env:CUSTOM_LLM_API_KEY,''}",
-                            "embedding_dimensions": 1536,
+                            "embedding_dimensions": embed_dims,
                             "model_output": "vector"
                         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backends/advanced/init.py` around lines 404 - 416, The embed model dict in
the branch guarded by embedding_model_name sets a hardcoded
"embedding_dimensions": 1536 which can break if the user supplies a different
embedding model; update the construction of that dict (the embed_model creation)
so embedding_dimensions is not hardcoded but derived or configurable: add a
lookup or helper (e.g., get_embedding_dimensions(embedding_model_name)) that
returns a sensible default (1536) but first checks an env/config value (e.g.,
CUSTOM_EMBED_DIM) or prompts the user for dimensions, and use that value for the
"embedding_dimensions" key instead of the literal 1536.

Comment on lines +328 to +345

def add_or_update_model(self, model_def: Dict[str, Any]):
    """
    Add or update a model in the models list by name.

    Args:
        model_def: Model definition dict with at least a 'name' key.
    """
    config = self._load_config_yml()
    if "models" not in config:
        config["models"] = []
    # Update existing or append
    for i, m in enumerate(config["models"]):
        if m.get("name") == model_def["name"]:
            config["models"][i] = model_def
            break
    else:
        config["models"].append(model_def)
    self._save_config_yml(config)

⚠️ Potential issue | 🟡 Minor

Missing validation for required name key in model_def.

If model_def lacks a "name" key, line 340 raises a raw KeyError. Other public methods in this class (e.g., set_memory_provider) validate inputs before operating. A guard here would produce a clearer error message and match the class's defensive style.

🛡️ Proposed fix
     def add_or_update_model(self, model_def: Dict[str, Any]):
         """
         Add or update a model in the models list by name.
 
         Args:
             model_def: Model definition dict with at least a 'name' key.
         """
+        if "name" not in model_def:
+            raise ValueError("model_def must contain a 'name' key")
         config = self._load_config_yml()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config_manager.py` around lines 328 - 345, add_or_update_model currently
assumes model_def contains a "name" key and will raise a raw KeyError; add an
explicit guard at the start of add_or_update_model to validate that model_def is
a dict and contains a non-empty string under "name" (e.g., if not
isinstance(model_def, dict) or not model_def.get("name") or not
isinstance(model_def["name"], str): raise ValueError("model_def must include a
non-empty 'name' string")), then proceed to call _load_config_yml, update or
append the model, and _save_config_yml as before so callers get a clear,
consistent error message instead of a KeyError.

Comment on lines +162 to +167

arch_to_cuda = {
    "sm_120": "cu128",  # RTX 5090, RTX 50 series
    "sm_90": "cu126",   # RTX 4090, H100
    "sm_89": "cu121",   # RTX 4090
    "sm_86": "cu121",   # RTX 3090, A6000
}

⚠️ Potential issue | 🟡 Minor

Inaccurate comment: RTX 4090 is sm_89, not sm_90.

The mapping comment on line 164 says "sm_90": "cu126", # RTX 4090, H100 but RTX 4090 is Ada Lovelace (sm_89). sm_90 is Hopper (H100, H200). Line 165 correctly maps sm_89 to RTX 4090.

📝 Fix the comment
         arch_to_cuda = {
             "sm_120": "cu128",  # RTX 5090, RTX 50 series
-            "sm_90": "cu126",   # RTX 4090, H100
+            "sm_90": "cu126",   # H100, H200 (Hopper)
             "sm_89": "cu121",   # RTX 4090
             "sm_86": "cu121",   # RTX 3090, A6000
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/tests/test_cuda_version_config.py` around lines 162 -
167, Update the inaccurate inline comment in the arch_to_cuda mapping: change
the comment for the "sm_90" key (currently marked "RTX 4090, H100") to reflect
that sm_90 corresponds to Hopper (H100/H200), and update or clarify the "sm_89"
comment to indicate sm_89 is Ada Lovelace (RTX 4090); edit the comments next to
the keys "sm_90" and "sm_89" in the arch_to_cuda dict accordingly.

Comment on lines +172 to +179

def test_older_gpus_work_with_cu121(self):
    """Test that older GPUs (sm_86, sm_80) work with cu121."""
    older_archs = ["sm_86", "sm_80", "sm_75"]  # RTX 3090, A100, RTX 2080

    for arch in older_archs:
        # cu121 supports these architectures
        assert arch in ["sm_75", "sm_80", "sm_86"], \
            f"{arch} should be supported by CUDA 12.1"

⚠️ Potential issue | 🟡 Minor

Test is tautological — it asserts elements are in the same list they were drawn from.

The assertion assert arch in ["sm_75", "sm_80", "sm_86"] will always pass because older_archs is ["sm_86", "sm_80", "sm_75"] — the exact same values. This test doesn't validate any real behavior (e.g., checking that cu121 actually supports these architectures).

🔧 Consider testing against a mapping instead
     def test_older_gpus_work_with_cu121(self):
         """Test that older GPUs (sm_86, sm_80) work with cu121."""
-        older_archs = ["sm_86", "sm_80", "sm_75"]  # RTX 3090, A100, RTX 2080
-
-        for arch in older_archs:
-            # cu121 supports these architectures
-            assert arch in ["sm_75", "sm_80", "sm_86"], \
-                f"{arch} should be supported by CUDA 12.1"
+        arch_to_min_cuda = {
+            "sm_86": "cu121",  # RTX 3090, A6000
+            "sm_80": "cu121",  # A100
+            "sm_75": "cu121",  # RTX 2080
+        }
+        for arch, min_cuda in arch_to_min_cuda.items():
+            assert min_cuda == "cu121", \
+                f"{arch} should be supported by CUDA 12.1"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/tests/test_cuda_version_config.py` around lines 172 -
179, The test test_older_gpus_work_with_cu121 is tautological because it asserts
each arch from older_archs is in the same hard-coded list; replace this with a
real validation by comparing older_archs to the authoritative CUDA 12.1
supported list (e.g., a constant or helper like CU121_SUPPORTED_ARCHS or
get_supported_archs_for_cuda("12.1")). Update the test to either assert
set(older_archs).issubset(set(CU121_SUPPORTED_ARCHS)) or iterate and assert each
arch is in CU121_SUPPORTED_ARCHS so the test actually verifies that CUDA 12.1
supports those architectures rather than re-checking the source list. Ensure you
reference the test name test_older_gpus_work_with_cu121 and the older_archs
variable when making the change.

Comment on lines +16 to +23

Vibevoice Url With Http Prefix Causes Double Prefix
    [Documentation]    Test that VIBEVOICE_ASR_URL WITH http:// causes double prefix (bug scenario).
    ${config_template}=    Create Dictionary    model_url=http://\${oc.env:VIBEVOICE_ASR_URL,host.docker.internal:8767}
    ${env_vars}=    Create Dictionary    VIBEVOICE_ASR_URL=http://host.docker.internal:8767

    ${resolved}=    Resolve Omega Config    ${config_template}    ${env_vars}
    Should Be Equal    ${resolved["model_url"]}    http://http://host.docker.internal:8767
    Should Contain    ${resolved["model_url"]}    http://http://

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find config.yml and related template files
fd "config" --type f -e yml -e yaml -e template | head -20

Repository: SimpleOpenSoftware/chronicle

Length of output: 163


🏁 Script executed:

# Check backends/advanced/init.py for the setup wizard defaults
fd "init.py" --path "*/backends/advanced/*" -type f

Repository: SimpleOpenSoftware/chronicle

Length of output: 287


🏁 Script executed:

# Check the test file to understand the exact pattern being tested
cat -n tests/configuration/test_transcription_url.robot | head -30

Repository: SimpleOpenSoftware/chronicle

Length of output: 1929


🏁 Script executed:

# Check backends directory for init.py
find backends -name "init.py" -type f

Repository: SimpleOpenSoftware/chronicle

Length of output: 97


🏁 Script executed:

# Read config/config.yml.template to see the ASR URL pattern
head -100 config/config.yml.template | grep -A 5 -B 5 "ASR_URL"

Repository: SimpleOpenSoftware/chronicle

Length of output: 54


🏁 Script executed:

# Search for VIBEVOICE_ASR_URL and PARAKEET_ASR_URL in the template
grep -n "VIBEVOICE_ASR_URL\|PARAKEET_ASR_URL" config/config.yml.template

Repository: SimpleOpenSoftware/chronicle

Length of output: 216


🏁 Script executed:

# Read backends/advanced/init.py around the setup wizard defaults
sed -n '260,300p' backends/advanced/init.py

Repository: SimpleOpenSoftware/chronicle

Length of output: 2553


🏁 Script executed:

# Get more context around Parakeet and VibeVoice setup
sed -n '260,300p' backends/advanced/init.py | cat -n

Repository: SimpleOpenSoftware/chronicle

Length of output: 2840


This test documents a real bug present in backends/advanced/init.py.

The test explicitly verifies that providing an http://-prefixed URL results in a double prefix (http://http://...). This exact scenario occurs with the setup wizard's defaults:

  • init.py lines 272 and 286 default both PARAKEET_ASR_URL and VIBEVOICE_ASR_URL to "http://host.docker.internal:8767" (with scheme)
  • The config template (lines 102 and 163) prepends http:// to the env var → resulting in http://http://...

The fix should remove the scheme from the setup wizard's default prompt values:

🐛 Proposed fix in backends/advanced/init.py
         elif choice == "2":
             self.console.print("[blue][INFO][/blue] Offline Parakeet ASR selected")
-            parakeet_url = self.prompt_value("Parakeet ASR URL", "http://host.docker.internal:8767")
+            parakeet_url = self.prompt_value("Parakeet ASR host:port (without http://)", "host.docker.internal:8767")
 
             # Write URL to .env for ${PARAKEET_ASR_URL} placeholder in config.yml
             self.config["PARAKEET_ASR_URL"] = parakeet_url
         elif choice == "3":
             self.console.print("[blue][INFO][/blue] Offline VibeVoice ASR selected (built-in speaker diarization)")
-            vibevoice_url = self.prompt_value("VibeVoice ASR URL", "http://host.docker.internal:8767")
+            vibevoice_url = self.prompt_value("VibeVoice ASR host:port (without http://)", "host.docker.internal:8767")
 
             # Write URL to .env for ${VIBEVOICE_ASR_URL} placeholder in config.yml
             self.config["VIBEVOICE_ASR_URL"] = vibevoice_url
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/configuration/test_transcription_url.robot` around lines 16 - 23, The
test shows a double-http prefix caused because backends/advanced/init.py sets
default prompt values for PARAKEET_ASR_URL and VIBEVOICE_ASR_URL with the
"http://" scheme, while the config template also prepends "http://"; fix by
removing the "http://" scheme from those default values in
backends/advanced/init.py (update the default strings for PARAKEET_ASR_URL and
VIBEVOICE_ASR_URL to e.g. host.docker.internal:8767) so the existing Resolve
Omega Config/template which prefixes http:// produces a single correct URL
(references: PARAKEET_ASR_URL, VIBEVOICE_ASR_URL, and the config template key
model_url).

Comment on lines +29 to +35

# We need to ensure values are strings for os.environ
str_env_vars = {k: str(v) for k, v in env_vars.items()}

with patch.dict(os.environ, str_env_vars):
    conf = OmegaConf.create(config_template)
    resolved = OmegaConf.to_container(conf, resolve=True)
    return resolved

⚠️ Potential issue | 🟡 Minor

patch.dict(os.environ, ...) is not thread-safe for parallel test runs.

os.environ is process-global. If tests run concurrently (e.g., via Pabot), two calls to resolve_omega_config with different env_vars will race and could read each other's patched values, producing flaky results.

Consider a subprocess-isolated approach or documenting that this helper is not safe under parallel execution.

🔒 Alternative: explicit resolver instead of env patching
-        str_env_vars = {k: str(v) for k, v in env_vars.items()}
-        
-        with patch.dict(os.environ, str_env_vars):
-            conf = OmegaConf.create(config_template)
-            resolved = OmegaConf.to_container(conf, resolve=True)
-            return resolved
+        str_env_vars = {k: str(v) for k, v in env_vars.items()}
+        # Register a one-shot resolver backed by the provided dict so we never
+        # touch os.environ (avoids races in parallel test runners).
+        OmegaConf.register_new_resolver(
+            "oc.env", lambda key, default="": str_env_vars.get(key, default),
+            replace=True,
+        )
+        try:
+            conf = OmegaConf.create(config_template)
+            resolved = OmegaConf.to_container(conf, resolve=True)
+            return resolved
+        finally:
+            OmegaConf.clear_resolvers()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/libs/ConfigTestHelper.py` around lines 29 - 35, The current
resolve_omega_config helper patches os.environ (via patch.dict) which is
process-global and not thread-safe; instead, modify resolve_omega_config to
avoid patching os.environ and register an explicit OmegaConf resolver that reads
from the provided env_vars dict (falling back to os.environ) so resolution is
isolated per call: convert env_vars values to strings, register a resolver
(OmegaConf.register_resolver) or use a custom function passed into OmegaConf
that returns env_vars.get(key) || os.environ.get(key), create the config with
OmegaConf.create(config_template), then call OmegaConf.to_container(conf,
resolve=True) and return the result; ensure you unregister or scope the resolver
per-call if necessary to avoid global side effects.

Comment on lines +49 to +59

def create_temp_config_structure(self, base_path: str, content: Dict[str, Any]) -> str:
    """
    Creates the config folder structure and config.yml within the given base path.
    """
    content = self._to_dict(content)
    path = Path(base_path) / "config"
    path.mkdir(parents=True, exist_ok=True)
    config_file = path / "config.yml"
    with open(config_file, "w") as f:
        yaml.dump(content, f, default_flow_style=False, sort_keys=False)
    return str(base_path)

⚠️ Potential issue | 🟡 Minor

No cleanup counterpart — temp files will persist on test failure.

create_temp_config_structure writes real files to disk but the helper provides no cleanup_temp_config_structure (or equivalent) method. If a Robot Framework test fails before its [Teardown] runs, the temp directory is left behind. With a fixed base_path (e.g. a predictable path under /tmp), a previous run's stale config.yml could silently influence the next run.

Adding a sibling helper ensures callers have a safe, symmetric API:

🧹 Proposed addition
import shutil

def cleanup_temp_config_structure(self, base_path: str) -> None:
    """Removes the config directory created by create_temp_config_structure."""
    config_dir = Path(base_path) / "config"
    shutil.rmtree(config_dir, ignore_errors=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/libs/ConfigTestHelper.py` around lines 49 - 59, Add a symmetric cleanup
helper to remove the temp config directory created by
create_temp_config_structure: implement a method named
cleanup_temp_config_structure(self, base_path: str) that imports shutil,
computes config_dir = Path(base_path) / "config" and calls
shutil.rmtree(config_dir, ignore_errors=True) so stale config.yml files are
removed even if tests fail; ensure the new method is documented and intended to
be called from test teardowns.
